

**IN THE CLAIMS**

Please cancel claims 1-13.

For the Examiner's convenience, a list of all claims is included below.

1.-13. (Canceled)

14. (Currently Amended) An apparatus comprising: a first input to receive a first packed data comprising at least four data elements; a second input to receive a second packed data comprising at least four data elements; a multiply-adder circuit, responsive to a first instruction, to multiply a first pair of data elements of the first packed data by respective data elements of the second packed data and to generate a first result representing a first sum of products of said multiplications of said respective data elements with said first pair of data elements, and to multiply a second pair of data elements of the first packed data by respective data elements of the second packed data and to generate a second result representing a second sum of products of said multiplications of said respective data elements with said second pair of data elements; and an ~~output~~output to store a third packed data comprising at least said first and said second results in response to the first instruction.

15. (Original) The apparatus of claim 14 wherein said first and second packed data each contain at least eight data elements.

16. (Original) The apparatus of claim 15 wherein said first and second packed data each contain at least 64-bits of packed data.

17. (Original) The apparatus of claim 15 wherein said first and second packed data each contain at least 128-bits of packed data.

18. (Original) The apparatus of claim 17 wherein said first and second packed data each contain at least sixteen data elements.

19. (Original) The apparatus of claim 17 wherein the first packed data comprises unsigned data elements.

20. (Original) The apparatus of claim 17 wherein the second packed data comprises signed data elements.

21. (Original) The apparatus of claim 20 wherein the first packed data comprises unsigned data elements.

22. (Original) The apparatus of claim 21 wherein the first and second results are generated using signed saturation.

23. (Original) The apparatus of claim 14 wherein the first and second results are truncated.

24. (Original) A computing system comprising: an addressable memory to store data; a processor including: a first storage area to store M packed data elements, the first storage area corresponding to a first N-bit source; a second storage area to store M packed data elements, the second storage area corresponding to a second N-bit source; a decoder to decode a first set of one or more instruction formats having a first field to specify the first N-bit source and a second field to specify the second N-bit source; an execution unit, responsive to the decoder decoding a first instruction of the first set of one or more instruction formats, to produce M products of multiplication of the packed data elements stored in the first storage area by corresponding packed data elements stored in the second storage area, and to sum the M products of multiplication pairwise to produce M/2 results representing M/2 sums of products; and a third storage area to store M/2 packed data elements, the third storage area corresponding to a N-bit destination specified by the first instruction to store the M/2 results; and a magnetic storage device to store said first instruction.

25. (Original) The computing system of claim 24 wherein N is 128.

26. (Original) The computing system of claim 25 wherein M is 16.

27. (Original) The computing system of claim 24 wherein N is 64.

28. (Original) The computing system of claim 27 wherein M is 8.

29. (Original) The computing system of claim 28 wherein said M packed data elements of the first storage area are treated as unsigned bytes.

30. (Original) The computing system of claim 29 wherein said M packed data elements of the second storage area are treated as signed bytes.

31. (Original) The computing system of claim 30 wherein each of said M/2 results are generated using signed saturation.
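
For illustration only, and not as part of any claim, the operation recited in claims 24 and 27-31 can be sketched in software. The sketch below assumes M = 8 byte elements per source, treats the first source as unsigned bytes and the second as signed bytes, sums adjacent products pairwise, and applies signed saturation. The 16-bit result width is an assumption inferred from storing M/2 = 4 results in a 64-bit destination; the function names `saturate_int16` and `multiply_add_pairs` are hypothetical and do not appear in the claims.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def saturate_int16(value):
    """Clamp an integer to the signed 16-bit range (signed saturation)."""
    return max(INT16_MIN, min(INT16_MAX, value))


def multiply_add_pairs(first, second):
    """Multiply corresponding elements, then sum adjacent products pairwise.

    first  -- sequence of M unsigned byte values (0..255)
    second -- sequence of M signed byte values (-128..127)
    Returns a list of M/2 sums of products, each saturated to 16 bits
    (the 16-bit width is an assumption of this sketch).
    """
    if len(first) != len(second) or len(first) % 2 != 0:
        raise ValueError("sources must have the same even number of elements")
    if any(not 0 <= a <= 255 for a in first):
        raise ValueError("first source elements must be unsigned bytes")
    if any(not -128 <= b <= 127 for b in second):
        raise ValueError("second source elements must be signed bytes")

    # M products of corresponding packed data elements.
    products = [a * b for a, b in zip(first, second)]

    # Pairwise sums of adjacent products give M/2 results.
    return [
        saturate_int16(products[i] + products[i + 1])
        for i in range(0, len(products), 2)
    ]
```

With `first = [1, 2, 3, 4, 5, 6, 7, 8]` and `second = [1, 1, 1, 1, -1, -1, -1, -1]`, the sketch yields the four results `[3, 7, -11, -15]`. A pair such as 255 × 127 + 255 × 127 exceeds the 16-bit signed range and is clamped to 32767 under the saturation assumption.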